 learning margin halfspace


A Near-optimal Algorithm for Learning Margin Halfspaces with Massart Noise

Neural Information Processing Systems

We study the problem of PAC learning $\gamma$-margin halfspaces in the presence of Massart noise. Without computational considerations, the sample complexity of this learning problem is known to be $\widetilde{\Theta}(1/(\gamma^2 \epsilon))$. Prior computationally efficient algorithms for the problem incur sample complexity $\tilde{O}(1/(\gamma^4 \epsilon^3))$ and achieve 0-1 error of $\eta + \epsilon$, where $\eta < 1/2$ is the upper bound on the noise rate. Recent work gave evidence of an information-computation tradeoff, suggesting that a quadratic dependence on $1/\epsilon$ is required for computationally efficient algorithms. Our main result is a computationally efficient learner with sample complexity $\widetilde{\Theta}(1/(\gamma^2 \epsilon^2))$, nearly matching this lower bound. In addition, our algorithm is simple and practical, relying on online SGD on a carefully selected sequence of convex losses.
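To make the setting concrete, the following is a minimal sketch of online projected SGD on a single convex surrogate, run on synthetic $\gamma$-margin data with Massart noise. It is not the paper's algorithm: the abstract does not specify its sequence of losses, so the LeakyReLU surrogate, the step size, the noise function, and all names (`sample_massart`, `leaky_relu_grad`, `online_sgd`, `zero_one_error`) are assumptions chosen for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sample_massart(w_star, gamma, eta, rng):
    """Draw one (x, y) with ||x|| = 1 and |<w_star, x>| >= gamma.

    The clean label sign(<w_star, x>) is flipped with a point-dependent
    probability that never exceeds eta (the Massart condition).
    """
    d = len(w_star)
    while True:  # rejection sampling to enforce the margin
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(dot(g, g))
        x = [v / norm for v in g]
        margin = dot(w_star, x)
        if abs(margin) >= gamma:
            break
    y = 1 if margin > 0 else -1
    # Hypothetical noise function: depends on x, bounded by eta since |x[-1]| <= 1.
    flip_prob = eta * abs(x[-1])
    if rng.random() < flip_prob:
        y = -y
    return x, y


def leaky_relu_grad(w, x, y, lam):
    """Subgradient in w of LeakyReLU_lam(-y * <w, x>).

    LeakyReLU_lam(t) = (1 - lam) * t for t >= 0 and lam * t otherwise.
    """
    t = -y * dot(w, x)
    slope = (1.0 - lam) if t >= 0 else lam
    return [-slope * y * v for v in x]


def online_sgd(stream, d, lam, step):
    """One pass of SGD over the stream, projecting onto the unit ball."""
    w = [0.0] * d
    for x, y in stream:
        grad = leaky_relu_grad(w, x, y, lam)
        w = [wi - step * gi for wi, gi in zip(w, grad)]
        norm = math.sqrt(dot(w, w))
        if norm > 1.0:
            w = [v / norm for v in w]
    return w


def zero_one_error(w, data):
    """Fraction of examples that w misclassifies (ties count as errors)."""
    wrong = sum(1 for x, y in data if y * dot(w, x) <= 0)
    return wrong / len(data)


rng = random.Random(0)
d, gamma, eta = 5, 0.1, 0.2
w_star = [1.0] + [0.0] * (d - 1)
train = [sample_massart(w_star, gamma, eta, rng) for _ in range(2000)]
held_out = [sample_massart(w_star, gamma, eta, rng) for _ in range(500)]
w_hat = online_sgd(train, d, lam=eta, step=0.05)
err = zero_one_error(w_hat, held_out)
```

Here `err` is measured against the noisy labels, which is the quantity the $\eta + \epsilon$ guarantee in the abstract refers to. Setting the surrogate parameter `lam` equal to the noise bound `eta` is a design choice of this sketch, not something stated in the abstract.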


Information-Computation Tradeoffs for Learning Margin Halfspaces with Random Classification Noise

Ilias Diakonikolas, Jelena Diakonikolas, Daniel M. Kane, Puqian Wang, Nikos Zarifis

arXiv.org Artificial Intelligence

We study the problem of PAC learning $\gamma$-margin halfspaces with Random Classification Noise. We establish an information-computation tradeoff suggesting an inherent gap between the sample complexity of the problem and the sample complexity of computationally efficient algorithms. Concretely, the sample complexity of the problem is $\widetilde{\Theta}(1/(\gamma^2 \epsilon))$. We start by giving a simple efficient algorithm with sample complexity $\widetilde{O}(1/(\gamma^2 \epsilon^2))$. Our main result is a lower bound for Statistical Query (SQ) algorithms and low-degree polynomial tests suggesting that the quadratic dependence on $1/\epsilon$ in the sample complexity is inherent for computationally efficient algorithms. Specifically, our results imply a lower bound of $\widetilde{\Omega}(1/(\gamma^{1/2} \epsilon^2))$ on the sample complexity of any efficient SQ learner or low-degree test.
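As a rough numerical illustration of the gap described above, the sketch below evaluates the three rates from the abstract at one setting of $\gamma$ and $\epsilon$. It drops all constants and logarithmic factors hidden by the tilde notation, so the outputs are orders of magnitude only, and the function names are hypothetical.

```python
def info_theoretic(gamma, eps):
    """Sample complexity of the problem: 1 / (gamma^2 * eps)."""
    return 1.0 / (gamma ** 2 * eps)


def efficient_upper(gamma, eps):
    """Rate of the simple efficient algorithm: 1 / (gamma^2 * eps^2)."""
    return 1.0 / (gamma ** 2 * eps ** 2)


def sq_lower(gamma, eps):
    """SQ / low-degree lower bound: 1 / (gamma^(1/2) * eps^2)."""
    return 1.0 / (gamma ** 0.5 * eps ** 2)


gamma, eps = 0.1, 0.01
n_info = info_theoretic(gamma, eps)
n_upper = efficient_upper(gamma, eps)
n_lower = sq_lower(gamma, eps)

# The efficient upper bound exceeds the information-theoretic rate by a factor 1/eps.
gap_in_eps = n_upper / n_info
# The upper and lower bounds for efficient learners agree in eps but differ
# in gamma by a factor gamma^(-3/2).
gap_in_gamma = n_upper / n_lower

print(f"information-theoretic: {n_info:.3g}")
print(f"efficient upper bound: {n_upper:.3g}")
print(f"SQ lower bound:        {n_lower:.3g}")
```

The two ratios show where the bounds are tight: the dependence on $1/\epsilon$ matches between the efficient algorithm and the lower bound, while the dependence on $1/\gamma$ does not.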